Aligning phonetic transcriptions with their citation forms

Author

  • Keith Johnson
Abstract

One of the main motivations for publishing this paper is to make available a matrix of phone-distance measures which may be useful in dealing with large corpora of conversational speech. The paper reports how this matrix of phone-distances was created from transcriber labeling disagreements, and how it can be used in a dynamic time warping algorithm to align phonetic transcriptions of conversational speech with their citation forms. The weighted string edit distance produced by the phone-distance DTW algorithm may also be useful in calculating neighborhood densities for studies of auditory word recognition. ©2003 Acoustical Society of America PACS numbers: 43.72.Lc, 43.72.Ne
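The alignment idea in the abstract can be illustrated with a small dynamic-programming sketch. This is not the author's implementation: it is a plain weighted edit-distance alignment standing in for the phone-distance DTW, and the `PHONE_DIST` table, the `INDEL_COST` and `DEFAULT_SUB` values, and the function names are all hypothetical placeholders rather than values from the published matrix.

```python
# Hypothetical substitution costs between phones.
# These are placeholder numbers, NOT the paper's phone-distance matrix.
PHONE_DIST = {
    ("t", "d"): 0.2,
    ("s", "z"): 0.2,
    ("eh", "ih"): 0.3,
}
INDEL_COST = 1.0   # assumed cost of inserting or deleting a phone
DEFAULT_SUB = 1.0  # assumed cost for phone pairs missing from the table


def phone_distance(a, b):
    """Symmetric lookup of the substitution cost between two phones."""
    if a == b:
        return 0.0
    return PHONE_DIST.get((a, b), PHONE_DIST.get((b, a), DEFAULT_SUB))


def align(citation, spoken):
    """Return (total cost, aligned pairs) for two phone sequences.

    A pair (p, None) marks a citation phone with no spoken counterpart
    (a deletion); (None, p) marks an inserted spoken phone.
    """
    n, m = len(citation), len(spoken)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * INDEL_COST
    for j in range(1, m + 1):
        cost[0][j] = j * INDEL_COST
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = phone_distance(citation[i - 1], spoken[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + sub,      # match or substitution
                cost[i - 1][j] + INDEL_COST,   # citation phone deleted
                cost[i][j - 1] + INDEL_COST,   # spoken phone inserted
            )

    # Trace back from the final cell to recover one optimal alignment.
    pairs = []
    i, j = n, m
    eps = 1e-9
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = phone_distance(citation[i - 1], spoken[j - 1])
            if abs(cost[i][j] - (cost[i - 1][j - 1] + sub)) < eps:
                pairs.append((citation[i - 1], spoken[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and abs(cost[i][j] - (cost[i - 1][j] + INDEL_COST)) < eps:
            pairs.append((citation[i - 1], None))
            i -= 1
        else:
            pairs.append((None, spoken[j - 1]))
            j -= 1
    pairs.reverse()
    return cost[n][m], pairs


if __name__ == "__main__":
    total, pairs = align(["t", "eh", "s", "t"], ["d", "eh", "s"])
    print(total, pairs)
```

With graded substitution costs, a reduced form that swaps in a similar phone scores closer to its citation form than one that swaps in an unrelated phone. The same total cost could serve as the weighted string edit distance the abstract mentions for neighborhood-density calculations.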


Similar resources

Acoustic reduction in conversational Dutch: A quantitative analysis based on automatically generated segmental transcriptions

In spontaneous, conversational speech, words are often reduced compared to their citation forms, such that a word like yesterday may sound like [’jESeI]. The present paper investigates such acoustic reduction. The study of reduction needs large corpora that are transcribed phonetically. The first part of this paper describes an automatic transcription procedure used to obtain such a large phon...


A corpus-based analysis of Korean segments produced by Japanese learners

This paper examines variations of Korean segments produced by Japanese learners of Korean. For corpus-based statistical analysis, we have used Korean read speech corpus produced by Japanese learners. Contrastive analysis of the target language and the source language is performed to provide information for interpreting the results of corpus analysis. Segmental variations are analyzed by alignin...


Applying speech verification to a large data base of German to obtain a statistical survey about rules of pronunciation

In this paper we present a new research project to obtain a statistical survey of the pronunciation of German using an automatic system for segmentation and labeling of speech data and a very large data base of spoken German (GermAn Spoken in Public, GASP). It mainly involves the development of two components: a) An automatic system of speech verification (PHONSEG) which produces a segmentation...


Validation of phonetic transcriptions in the context of automatic speech recognition

Some of the speech databases and large spoken language corpora that have been collected during the last fifteen years have been (at least partly) annotated with a broad phonetic transcription. Such phonetic transcriptions are often validated in terms of their resemblance to a handcrafted reference transcription. However, there are at least two methodological issues questioning this validation m...


Application-oriented validation o preliminary r

There is an increasing need for automatic procedures to generate and validate phonetic transcriptions. As the production of manual phonetic transcriptions tends to be time-consuming, error-prone and costly, procedures have been developed to derive phonetic transcriptions automatically by means of automatic speech recognition technology. Such automatic phonetic transcriptions are usually validat...




Publication date: 2003